Numerical behavior of NVIDIA tensor cores

Authors

Abstract

We explore the floating-point arithmetic implemented in the NVIDIA tensor cores, which are hardware accelerators for mixed-precision matrix multiplication available on the Volta, Turing, and Ampere microarchitectures. Using Volta V100, Turing T4, and Ampere A100 graphics cards, we determine what precision is used for the intermediate results, whether subnormal numbers are supported, what rounding mode is used, in which order the operations underlying the matrix multiplication are performed, and whether partial sums are normalized. These aspects are not documented by NVIDIA, and we gain insight by running carefully designed numerical experiments on these hardware units. Knowing the answers to these questions is important if one wishes to: (1) accurately simulate NVIDIA tensor cores on conventional hardware; (2) understand the differences between results produced by code that utilizes tensor cores and code that uses only IEEE 754-compliant arithmetic operations; and (3) build custom hardware whose behavior matches that of NVIDIA tensor cores. As part of this work we provide a test suite that can be easily adapted to test newer versions of the NVIDIA tensor cores as well as similar accelerators from other vendors, as they become available. Moreover, we identify a non-monotonicity issue affecting floating-point multi-operand adders if the intermediate results are not normalized after each step.
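To make the idea of a "carefully designed numerical experiment" concrete, here is a minimal software-only sketch of a rounding-mode probe. It is not taken from the authors' test suite and says nothing about what the hardware actually does. It only shows that two candidate accumulator behaviors (round-to-nearest versus round-toward-zero into binary32) give different answers for one chosen input. All function names are hypothetical. The sketch assumes the probe values stay within the binary32 normal range, and that Python's `struct` module converts to binary32 with round-to-nearest.

```python
import struct

def is_fp16(x: float) -> bool:
    """True if x is exactly representable in IEEE 754 binary16."""
    return struct.unpack("<e", struct.pack("<e", x))[0] == x

def round_fp32_nearest(x: float) -> float:
    """Round a binary64 value to binary32 (round-to-nearest, ties-to-even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def round_fp32_toward_zero(x: float) -> float:
    """Truncate a binary64 value to binary32 precision.

    Assumption: x lies in the binary32 normal range, so dropping the
    low 29 fraction bits (52 - 23) is all that is needed.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits &= ~((1 << 29) - 1)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def probe(round_fn) -> float:
    """Model c + a*b with binary16 inputs a, b and a binary32 result.

    The exact value 1 + 1.5 * 2**-24 lies above the midpoint between the
    binary32 neighbors 1 and 1 + 2**-23, so the two rounding modes differ.
    """
    c = 1.0
    a = 1.5 * 2.0**-12
    b = 2.0**-12
    assert is_fp16(a) and is_fp16(b)
    # The product and the sum are both exact in binary64.
    return round_fn(c + a * b)

if __name__ == "__main__":
    print("round-to-nearest :", probe(round_fp32_nearest).hex())
    print("round-toward-zero:", probe(round_fp32_toward_zero).hex())
```

Running the same inputs on a real accelerator and comparing the returned bit pattern with these two model outputs is one plausible way to tell the candidate rounding modes apart.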

Similar resources

NVIDIA Tensor Core Programmability, Performance & Precision

The NVIDIA Volta GPU microarchitecture introduces a specialized unit, called Tensor Core, that performs one matrix-multiply-and-accumulate on 4×4 matrices per clock cycle. The NVIDIA Tesla V100 accelerator, featuring the Volta microarchitecture, provides 640 Tensor Cores with a theoretical peak performance of 125 Tflops/s in mixed precision. In this paper, we investigate current approaches to pro...

Compressive Behavior of Fiber-Reinforced Honeycomb Cores

Honeycomb core sandwich panels have found extensive applications, particularly in the aerospace and naval industries. In view of the recent interest in alternative, yet strong and lightweight materials, honeycomb cores are manufactured from sisal fiber-reinforced polypropylene (PP) composites and the out-of-plane compressive behaviour of these cores is investigated. The cell wall material is mode...

Journal

Journal title: PeerJ

Year: 2021

ISSN: 2167-8359

DOI: https://doi.org/10.7717/peerj-cs.330